Pronunciation Clustering and Modeling of Variability for Appearance-Based Sign Language Recognition

Authors

Morteza Zahedi

Daniel Keysers

Hermann Ney

Abstract

In the domain of sign language recognition from video, most approaches first try to segment and track the hands and head of the signer and then extract a feature vector from these regions [1, 2]. Segmentation can be difficult because of possible occlusions between the hands and the head of the signer, noise, or brisk movements. Many approaches therefore use special data acquisition tools such as data gloves, colored gloves, or wearable cameras. Furthermore, different signers express the words and phrases of sign language differently, and a single word sometimes has two or three different pronunciations that differ in visual appearance. In this work, we introduce a database of video streams for American Sign Language word recognition. The utterances are extracted from a publicly available database and can therefore be used by other research groups. This database, which we call ‘BOSTON50’, consists of 483 utterances of 50 words. One important property of this database is the large variability of utterances for each word, which makes it more difficult to recognize automatically than databases in which all utterances are signed uniformly. So far, this problem has not been dealt with in the literature on sign language recognition. To overcome these shortcomings we suggest the following novel approaches:

Similar resources

Robust appearance based sign language recognition

In this work, we introduce a robust appearance-based sign language recognition system which is derived from a large vocabulary speech recognition system. The system employs a large variety of methods known from automatic speech recognition research for the modeling of temporal and language specific issues. The feature extraction part of the system is based on recent developments in image proces...

Perception and Synthesis of Biologically Plausible Motion: From Human Physiology to Virtual Reality

Temporal measures of hand and speech coordination during French cued speech production; Using signing space as a representation for sign language processing; Spatialised semantic relations in French sign language: toward a computational modelling; Automatic generation of German sign language glosses from German words; French sign language processing: verb agreement; R...

Mandarin Pronunciation Modeling Based on CASS Corpus

Pronunciation variability is an important issue that must be faced when developing practical automatic spontaneous speech recognition systems. In this paper, the factors that may affect the recognition performance are analyzed, including those specific to the Chinese language. By studying the INITIAL/FINAL (IF) characteristics of the Chinese language and developing the Bayesian equation, w...

Modeling of Pronunciation, Language and Nonverbal Units at Conversational Russian Speech Recognition

The main problems in developing a conversational Russian speech recognition system are variability of pronunciation, free word order in sentences, and the presence of speech disfluencies. In the paper, pronunciation variability is modeled by creating multiple word transcriptions. A syntactic-statistical language model that takes into account long-distance word dependencies is proposed for Russian ...

Modeling pronunciation variation using context-dependent weighting and b/s refined acoustic modeling

Pronunciation variability is an important issue that must be faced when developing practical automatic spontaneous speech recognition systems. By studying the initial/final (IF) characteristics of the Chinese language and developing the Bayesian equation, we propose the concepts of generalized initial/final (GIF) and generalized syllable (GS), the GIF modeling method and the IF-GIF modelin...

Publication date: 2005
